Recommendations

In this section, I crystallise insights from all the analytical sections below to provide a set of recommendations on how Monzo should measure, monitor, and potentially increase retention.

Introduction

In this notebook, I attempt to answer four important questions regarding the retention of users on a platform like Monzo.

Question 1: How would you define user retention for a business like ours and why?

For a business like Monzo, user retention can be analysed through metric families derived from key actions on the platform, aggregated over different time scales. I illustrate this by considering two key actions – opening the Monzo app and completing a transaction. From these I derive three metric families, each aggregated daily, weekly, and monthly.

The first two families look at user retention through five metrics that I define below:

1. Active Users – the number of users who performed the action in a given unit of time – daily, weekly, or monthly.
2. Retained Users – the number of users who performed the action in two consecutive time units.
3. Churned Users – the number of users who performed the action in the previous time unit but not the current one.
4. Resurrected Users – the number of users who performed the action in the given time unit, did not do so in the previous one, and for whom this was not their first time unit of activity.
5. First Active Users – the number of users who performed the action for the very first time in that particular time unit.

These can be understood as a decomposition of the total user base using the following equations:

\(Total\;Users = Active\;Users + Churned\;Users\)
\(Active\;Users = Retained\;Users + Resurrected\;Users + First\;Active\;Users\)

This decomposition allows us to track user retention on the platform for key actions and identify behavior of the user base. I shall illustrate this through examples in the sections below.
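To make the decomposition concrete, here is a minimal pandas sketch of how these five metrics could be computed from an event log. The `user_id` and `date` column names, and the single-table layout, are illustrative assumptions rather than Monzo's actual schema.

```python
import pandas as pd

def retention_decomposition(events: pd.DataFrame, freq: str = "D") -> pd.DataFrame:
    """Decompose active users per period into retained, resurrected, and
    first-active users, and count churned users. Expects user_id and date
    columns (assumed names)."""
    ev = events.copy()
    ev["period"] = ev["date"].dt.to_period(freq)
    # One row per (user, period) with any activity
    active = (
        ev[["user_id", "period"]]
        .drop_duplicates()
        .sort_values(["user_id", "period"])
    )
    active["prev_period"] = active.groupby("user_id")["period"].shift(1)
    active["first"] = active.groupby("user_id").cumcount() == 0
    active["retained"] = active["prev_period"] == (active["period"] - 1)
    active["resurrected"] = ~active["first"] & ~active["retained"]
    out = active.groupby("period").agg(
        active_users=("user_id", "nunique"),
        retained_users=("retained", "sum"),
        resurrected_users=("resurrected", "sum"),
        first_active_users=("first", "sum"),
    )
    # Churned in period t: active in t-1 but not retained into t
    out["churned_users"] = (
        out["active_users"].shift(1) - out["retained_users"]
    ).fillna(0).astype(int)
    return out
```

Setting `freq` to `"D"`, `"W"`, or `"M"` reproduces the daily, weekly, and monthly aggregations used throughout this section.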

1. App Opens

Daily

In this section, I illustrate the user retention metrics discussed above for unique users who opened the app on a given day. The plot below shows the actual numbers as well as a trend-line for all 5 metrics discussed above. The calculation is done with a granularity of a single day.

From the plot above we can observe the following:

  • The number of active users who opened the app at least once on a given day seems to be growing steadily, as indicated by the trend line. There seem to be prominent spikes during the first week of each month. This suggests that many users open the app to pay rent, utilities, or settle accounts with friends/roommates. The spikes are also getting larger with time, indicating that newly acquired users are opening the app with the same frequency and probably for the same reasons.

  • Resurrected users seem to be the biggest contributor to active users, followed by retained users and first active users. This indicates that very few users open the app on consecutive days. Further, active user growth is not merely coming from newly active users, but rather from users returning to the platform after being inactive for at least a day. This is a good sign that users are actually using the platform, just not daily. This is to be expected, since the Monzo app is not meant to be used daily like social media apps.

  • This plot is useful for monitoring acute issues with the platform, like outages. We can see somewhat prominent dips in active users below the trend-line in the weeks beginning on the 23rd of May and the 6th of June. While the dips are not prominent enough to suggest a platform-wide issue, it would nonetheless be prudent to investigate their cause.

Weekly

Below I plot these 5 metrics with a time granularity of a week. That is, active users are the total number of unique users who opened the app at least once in a week, retained users are those who did so in two consecutive weeks, and so on.


From the plot above we can observe the following:

  • Weekly active users are much higher in number than daily active users, and the actual line is smoother, almost hugging the trend line. This suggests that most users – certainly more than those opening the app daily – open the app at least once a week. We can still observe small spikes in app usage in the first week of every month, which adds weight to my hypothesis in the last section about why.

  • Retained users seem to be the largest contributor, followed by resurrected users and first active users. This is unlike the previous plot, where daily retained users contributed far less. This further suggests that most users open the app at least once a week and that the platform is doing a good job of bringing them back; hence the lower number of churned users compared to the daily plot. The plateauing of first active users suggests that growth from new users is steady week on week. This may indicate an issue with the user acquisition strategy, since we would ideally expect to be acquiring more users week on week through various channels.

  • This plot is a good candidate for investigating chronic issues with the platform. From the plot we can observe a dip below the trend-line from the 30th of April until the 28th of May. While this may be just regression to the mean after the expected spike at the beginning of May, it is nonetheless prudent to ensure nothing is amiss.

Monthly

In this section, I use a time granularity of 1 month. Therefore monthly active users are the number of unique users who opened the app at least once in a given month.

From this plot we can observe the following:

  • Monthly active users are higher than weekly active users, as we would expect, and the actual line is virtually indistinguishable from the trend-line. This suggests that there is very little month-to-month variation away from the trend, and that growth in app opens is following a linear trend.

  • Retained users make the largest contribution to active users, followed by first active users and then resurrected users. This adds weight to our earlier conclusion that the platform is doing a good job of bringing customers back to the app. As stated earlier, the plateauing of first active users suggests that the user acquisition strategy may not be working as well as we would like. The increase in churned users over time may also suggest that newly acquired users are not opening the app as much.

  • This plot can be useful for monitoring longer-term trends in app usage. Given that the data is not very noisy, any unexpected fluctuation from the trend line suggests something has materially changed and is a point for further inquiry.

2. Transacting Users

In this section, I propose measuring user retention using transaction data.

Daily

In this section, I illustrate user retention metrics for unique users who completed at least one transaction in a day. The plot below shows the actual number as well as a trend-line for all 5 metrics. The calculation is done with granularity of a single day.

From the plot above we can observe the following:

  • Daily transacting users are far more numerous than daily app-opening users. This makes sense, since completing transactions better captures why people sign up to Monzo, so daily transacting users is perhaps a better way to understand user retention. We can also observe a consistent weekly seasonality, with transactions peaking from Thursday to Saturday and then bottoming out on Sundays. This suggests that users primarily use Monzo to pay at pubs, restaurants, or other leisure activities over the weekend.

  • Retained users make the largest contribution to active users, which differs from app opens. This further adds weight to the argument that this is a better metric for measuring user retention for a business like Monzo.

  • Growth in first active users is minimal to non-existent. This may be a cause for concern and a reason to examine the user acquisition strategy carefully.

Weekly and Monthly

In this section I show transacting user retention aggregated on a weekly and monthly basis. I group these together since the inferences from these plots are essentially the same.

From these plots we can observe the following:

  • The trend for weekly and monthly transacting users is strongly upward and shows little variance; the actual line more or less hugs the trend line. These plots confirm the inference from above that this is a good measure of user retention: Monzo users find value in the product primarily through completing transactions. These plots are the best way to monitor user retention; any deviation from the trend line would suggest some major change and is worth investigating.

  • As observed in all the plots above, the flat line for first active users is something that should be investigated.

3. Transaction Amount and Number of Transactions

In this section, I examine the transaction amount and number of transactions completed by users. As an aside, I also investigated the type of transactions – credit and debit. Most transactions, in volume and total amount, are debit transactions (I do not show the plots since it’s not relevant to the question). This suggests that users primarily use Monzo for purchases of products and services and less as their salary account. While this is out of scope of this analysis, it is worth flagging since this might suggest a way to increase user engagement with the platform. In the rest of the notebook I only look at the absolute value of transactions and do not differentiate between credit and debit.

In order to understand user retention, I look at the change in average transaction value and the number of transactions per transacting user. I only consider weekly and monthly time units, since it is evident from the sections above that these illustrate user patterns better.

From the plots above, we can observe that the average transaction amount and number of transactions per transacting user have increased significantly over time; monthly transaction amount has doubled. This indicates that users are finding increasing value in Monzo. This is also a sign that users are being retained better. A dip in this value would indicate otherwise and should be investigated.
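As a sketch of this calculation, the per-transacting-user weekly averages could be derived as follows, assuming a transactions table with `user_id`, `date`, and signed `amount` columns (these names are illustrative):

```python
import pandas as pd

def per_user_weekly_stats(txns: pd.DataFrame) -> pd.DataFrame:
    """Weekly averages per transacting user. Absolute amounts are used,
    so credit and debit are treated alike, as in the analysis above."""
    t = txns.copy()
    t["week"] = t["date"].dt.to_period("W")
    t["amount"] = t["amount"].abs()
    weekly = t.groupby("week").agg(
        transacting_users=("user_id", "nunique"),
        n_transactions=("user_id", "size"),
        total_amount=("amount", "sum"),
    )
    weekly["txns_per_user"] = weekly["n_transactions"] / weekly["transacting_users"]
    weekly["amount_per_user"] = weekly["total_amount"] / weekly["transacting_users"]
    return weekly
```

Swapping `to_period("W")` for `to_period("M")` gives the monthly version of the same table.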

Conclusion

I have proposed three metric families and illustrated their utility in understanding user retention for a business like Monzo. It is tempting to rely on a single metric to measure something as critical as user retention, but I would caution against that: once we pick one metric and various teams work to optimise it, as per Goodhart’s Law, it stops being a good measure. It is with this in mind that I have proposed three metric families that capture different aspects of user behavior on Monzo. By monitoring all three, we can be sure not to rely overly on any single one and potentially suffer the consequences of it becoming a bad measure of user retention.

Question 2: How can we analyse retention with the smallest delay in time and as unskewed as possible, i.e. how can we identify an underperforming cohort as quickly as possible?

In the previous section, I proposed three different metric families that measure overall user behavior on the platform and are a useful way to monitor user retention. However, they can only indicate that there may be a problem overall; they cannot diagnose where exactly the problem may be. In this answer I propose a few different ways to constitute user cohorts and analyse their usage of Monzo. These cohorts will allow us to identify quickly, and without much skew, where the problem may lie.

I propose constituting cohorts of two different types – 1.) Month of activation and 2.) Age of users. Ideally, I would have also liked to cohort users on the basis of their acquisition channel, but that data is not available. I analyze these cohorts by comparing their behavior for the first 24 weeks since activation. Since users were activated at different times and I have data for only 8 months, not all user cohorts will have stable values for the entire period. Nonetheless, this is meant to be illustrative of the method.

Month of Activation User Cohorts

In this section, I group users into cohorts based on their month of activation, with the assumption that they may have been acquired through similar means and thus constitute a meaningful group. As mentioned earlier, I would have preferred to group them by acquisition channel, but in the absence of that data this serves as a reasonable proxy. I will analyse the key actions – number of transactions, transaction amount, and app opens. For each of these, I look at the percentage of users in the cohort who completed the action each week, as well as the mean and the median value, so as to detect any skew within the cohort. I use only a time unit of one week, since it was evident from the previous analysis that this best represents how users prefer to perform these key actions.
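A sketch of how such a cohort table might be built, assuming illustrative `user_id`, `date`, and `activated_at` columns rather than the actual schema:

```python
import pandas as pd

def cohort_weekly(txns: pd.DataFrame, users: pd.DataFrame) -> pd.DataFrame:
    """For each activation-month cohort and each week since activation,
    report the number of users who transacted, the mean and median number
    of transactions per active user, and the share of the cohort active."""
    df = txns.merge(users[["user_id", "activated_at"]], on="user_id")
    df["cohort"] = df["activated_at"].dt.to_period("M")
    df["week_since"] = (df["date"] - df["activated_at"]).dt.days // 7
    # Transactions per user per week-since-activation
    per_user = (
        df.groupby(["cohort", "week_since", "user_id"])
        .size().rename("n_txns").reset_index()
    )
    # Full cohort sizes, including users with no transactions
    sizes = (
        users.assign(cohort=users["activated_at"].dt.to_period("M"))
        .groupby("cohort")["user_id"].nunique()
    )
    out = per_user.groupby(["cohort", "week_since"]).agg(
        active=("user_id", "nunique"),
        mean_txns=("n_txns", "mean"),
        median_txns=("n_txns", "median"),
    ).reset_index()
    out["pct_active"] = out["active"] / out["cohort"].map(sizes)
    return out
```

The same shape of table, with a different event source, gives the app-opens version used later in this section.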

I begin by plotting the cohort sizes below.

From the plot, we can say that except for the November, December, and June cohorts, all the others are of comparable size. The exceptions should be kept in mind for when we notice comparatively higher variance due to smaller cohort sizes. Further, I exclude the June cohort from the analysis, since those users have not had enough time on the platform (less than 2 weeks).

Monthly Cohorts and Transactions

In the plots below I examine the transaction behavior of these cohorts. I produce 5 plots that look at 1.) the % of users in the cohort who completed a transaction, 2.) the average number of transactions, 3.) the number of transactions completed by the median user, 4.) the average transaction amount, and 5.) the transaction amount of the median user.

From these plots we can observe the following:

  • The first thing that stands out is the prominent dip across the plots during the last three weeks for each cohort, except the November and December ones. This is because the plots do not show the last three weeks for those two cohorts; indeed, they also display a similar drop. This suggests a platform-wide problem that is not specific to any cohort, and is a manifestation of a trend I pointed out in the first section of this analysis. Investigating the cause of this drop is out of scope of this analysis.

  • All cohorts seem to show comparable average transaction behavior. The initial rise over the first four weeks can be explained by the distribution of activation dates across the four weeks of the month. Thereafter, cohort behavior seems more or less stable. Visually, it would seem that cohort performance decreases from older to newer cohorts (November can be excluded since it is a small cohort). However, the differences between them are not prominent enough to suggest a meaningful cause.

  • The median values show much greater variation, indicating significant skew within the cohorts. This further makes the case that we should analyse these on the basis of acquisition channels. That might help explain which cohorts perform better and, crucially, provide actionable insights; for example, we could invest less in those acquisition channels that lead to poorly performing users. The skew may also be explained by differences in the demographics of users within the cohort, which I will explore in the succeeding sections.

Monthly Cohorts and App Opens

In this section I analyse the app usage behavior of these cohorts using plots similar to those above.

From these plots we can observe the following:

  • App usage seems to spike in the first few weeks after activation and then stabilises. Here too we see a prominent drop in the last four weeks for each cohort. By comparing the average and the median values, we can confirm that there is some skew within the cohorts, with some power users pulling the average up. Overall, this confirms most of the inferences from the transaction data.

  • Both the transactions and app usage plots suggest that performance drops from earlier to later cohorts; the December cohort consistently performs better on all plots. This suggests that we should investigate the reasons behind this trend more deeply. As mentioned earlier, it will be worthwhile examining acquisition channels and demographics to understand why earlier monthly cohorts perform better than later ones. Perhaps it is merely due to users becoming more familiar with Monzo’s products and features; we could test this hypothesis by looking at longer time scales for later cohorts.

Age Cohorts

In this section, I cohort users by age and investigate their usage behavior. First, I plot the distribution of ages from monzo_product.users to decide the cohort sizes and bins.

From these plots we can observe that Monzo’s user base is primarily students and young professionals: 50% of the user base is aged below 28, 80% below 35, and the most common (modal) age is 24. The distribution has a long tail, with a maximum age of 89, which skews the average age up to 29.87.

Based on this, I propose cohorting users into five buckets – 18-22, 23-26, 27-30, 31-34 – with the last bucket containing all those aged 35 and above. Below I plot the cohort sizes as per this division.

From the plot above we can observe that these cohorts are of comparable size, except for the fourth cohort, which is something to keep in mind if we notice high variance in that cohort’s aggregated measures. These cohorts can also be usefully characterised: 18-22 would be mostly students, 23-26 students/young professionals, 27-30 young professionals, 31-34 professionals possibly with families, and the final cohort the tail of ages 35 and above.
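The proposed bucketing can be expressed with `pandas.cut`; the bin edges below follow the description above, and the `age` column name is an assumption:

```python
import pandas as pd

def age_cohort(ages: pd.Series) -> pd.Series:
    """Bucket ages into the five cohorts described above.
    right=False makes each bin closed on the left: [18, 23), [23, 27), ..."""
    bins = [18, 23, 27, 31, 35, 200]  # 18-22, 23-26, 27-30, 31-34, 35+
    labels = ["18-22", "23-26", "27-30", "31-34", "35+"]
    return pd.cut(ages, bins=bins, labels=labels, right=False)
```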

Age Cohorts and Transaction Data

In this section, I explore the transaction behavior for each age cohort.

From these plots we can observe the following:

  • Cohort performance remains mostly stable over the first 24 weeks for all cohorts (except for a spike in average transaction amount for the 35+ cohort). The largest cohort, 23-26, seems to be the best performing on all measures (although their mean transaction amount is lower than that of older cohorts, the median amount still ranks on top). This is a good sign, since this cohort constitutes a plurality of users, and suggests that Monzo is working well for its largest and (possibly) target user base.

  • The worst performing cohort seems to be those aged 35 and above. It is worth noting that this is a heterogeneous cohort, and we must be careful about generalising. For example, we can observe from the average and median plots that there is significant skew within the cohort, with some users, probably younger ones, pushing up the average. Nonetheless, it indicates that these older users do not find as much utility in transacting with Monzo as younger users do, probably because they are still using the services of legacy banks. This might also point to an opportunity to better understand their needs and tailor products and features accordingly.

Age Cohorts and App Opens

In this section, I explore the frequency of app usage for each age cohort using similar plots.

From the plots above we can observe the following:

  • The trends reflect what we saw with the transaction data. There is a marked difference between the 35+ cohort and the other cohorts, with the former lagging behind on most measures. In terms of app usage, the youngest cohort seems to be doing better than it did on transactions. This can be explained by the fact that they have less disposable income compared to older cohorts. Save for the oldest cohort, between 60% and 70% of the users in each cohort seem to be using the app on a weekly basis. This is a good sign that the app is well designed and working well for them.

  • Once again, it might be worth investigating why the oldest cohort seems to be lagging behind the others. Perhaps this is because they have not fully understood the variety of features and/or the mechanics of using the app. Since this segment of users is usually wealthier, they may prove to be valuable customers for Monzo, and we should consider investing in making the app easier for them to use.

  • The difference between mean and median suggests that there is some skew in all the cohorts, with some power users pulling the average up. It could be valuable to understand the cause of this skew through qualitative interviews and granular analyses of the data. This could give us valuable insights that could help improve the app to bridge the gap.

Conclusions

In this answer, I have demonstrated how cohort analysis can help us analyse differences in user behavior on Monzo with more granularity. I have illustrated frameworks for cohort analysis by cohorting users by month of activation and by age. I have analysed the performance of these cohorts on the basis of key actions on the platform and provided interpretations and recommendations. With these frameworks coded up, we can quickly identify underperforming user segments and, by considering both the mean and the median, ensure that our conclusions are robust to skew. Another important cohort type that is out of scope of this analysis is user acquisition channel. We can use the same frameworks as the ones demonstrated here to understand better which acquisition channels provide the most valuable users.

Question 3: What are the segments of our customers that retain better than others? Can we learn anything useful for our product strategy from it?

In the last section, we already learned that customers in the 23-26 age range perform better on most retention metrics, while users aged 35 and above perform the worst. I also proposed some steps we can take to better understand the reasons behind this trend.

In this section, I segment users by the features associated with them in the table monzo_product.users to see if there are any differences. I first explore each feature individually and its association with retention metrics. To make these metrics comparable across users with different activation dates, I scale them for each user by the number of weeks they have been active on Monzo (excluding the last, partially completed week of June). Finally, I use linear regression modeling to check whether there are any combined effects of these features on retention metrics.
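The per-user scaling described here might look like the following sketch; column names such as `activated_at`, `n_transactions`, `total_amount`, and `app_opens` are illustrative assumptions:

```python
import pandas as pd

def per_week_rates(user_totals: pd.DataFrame, as_of: pd.Timestamp) -> pd.DataFrame:
    """Divide each user's lifetime totals by their number of complete weeks
    on Monzo, so users with different activation dates are comparable."""
    out = user_totals.copy()
    weeks = ((as_of - out["activated_at"]).dt.days // 7).clip(lower=1)
    for col in ["n_transactions", "total_amount", "app_opens"]:
        out["mean_week_" + col] = out[col] / weeks
    return out
```

Here `as_of` would be the last complete week boundary in the data, so the partial final week is excluded as described above.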

Friends on Monzo

This field has significant data quality issues: for 23% of users the table has a NULL value, 37.73% have negative values, 6% have a value of zero, and the remaining 33% have positive values. I deal with this by assuming zero friends wherever the value is NULL or negative.
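The cleaning rule described above – treating NULL and negative values as zero friends – could be implemented as a small sketch like this (the `friends_on_monzo` column name follows the correlation output in this section; everything else is illustrative):

```python
import numpy as np
import pandas as pd

def clean_friends(users: pd.DataFrame) -> pd.DataFrame:
    """Replace NULL and negative friends_on_monzo values with zero."""
    users = users.copy()
    f = users["friends_on_monzo"].fillna(0)
    users["friends_on_monzo"] = f.where(f > 0, 0)  # keep positives, zero the rest
    return users
```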

We might hypothesise that users with more friends on Monzo use the app more, since making payments to friends easily is a distinctive feature of Monzo. To test this hypothesis, I check the association between having zero versus non-zero friends and retention metrics, as well as the Pearson correlation coefficients between the number of friends and retention metrics (excluding all users with zero friends).

##                  friends_on_monzo mean_week_m_t weeks_on_monzo
## friends_on_monzo       1.00000000    0.09880419    -0.03407144
## mean_week_m_t          0.09880419    1.00000000     0.04182814
## weeks_on_monzo        -0.03407144    0.04182814     1.00000000

##                  friends_on_monzo mean_week_amount
## friends_on_monzo        1.0000000        0.0177292
## mean_week_amount        0.0177292        1.0000000

##                     friends_on_monzo mean_week_app_opens
## friends_on_monzo          1.00000000         -0.03191394
## mean_week_app_opens      -0.03191394          1.00000000

It is evident from the plots and correlation matrices above that having friends on Monzo is not associated with any meaningful difference in performance on retention metrics. It is worth noting that this may be due to the data quality issue mentioned earlier; there is simply not enough data on users with recorded friends on Monzo in the data set.

Overdraft Facility

I treat this as a categorical variable taking the value 1 for users offered an overdraft and 0 for those not offered one. 18.5% of users have a NULL value, which I change to zero. We might expect users with this facility to use Monzo more, since they have greater spending power. Below I create boxplots to test the association between this variable and retention metrics.

There seems to be no statistically significant difference in retention metrics between those offered an overdraft and those who were not.

Android Pay Activated

This feature is a date, which I assume indicates when the user activated Android Pay. Barely 4% of users have activated this feature.

We could hypothesise that a user who has activated this feature might transact more and/or use the app more, both compared to users without this feature and compared to their own usage before activation. To test the association with retention metrics, I therefore make both a between-user and a within-user (before/after) comparison: in the former, I compare performance on retention metrics for users with this feature activated and those without; in the latter, I compare retention metrics before and after activation, only for those users who activated the feature. I produce box plots to test these.

There seems to be no statistically significant difference in retention metrics between those with this feature and those without; there is also no statistically significant change before and after this feature was activated. It could be interesting to check whether this holds for users who have activated Apple Pay.
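For reference, the within-user before/after comparison could be assembled along these lines; column names like `week_start`, `app_opens`, and `apay_activated_at` are assumptions, not the actual schema:

```python
import pandas as pd

def before_after(weekly: pd.DataFrame, users: pd.DataFrame) -> pd.DataFrame:
    """Mean weekly app opens before vs after Android Pay activation,
    restricted to users who activated the feature. Returns one row per
    user with 'before' and 'after' columns."""
    df = weekly.merge(users.dropna(subset=["apay_activated_at"]), on="user_id")
    df["phase"] = (df["week_start"] >= df["apay_activated_at"]).map(
        {False: "before", True: "after"}
    )
    return df.groupby(["user_id", "phase"])["app_opens"].mean().unstack()
```

A paired test, such as the Wilcoxon signed-rank test, could then be run on the two resulting columns.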

Profile Photo Upload

This is a date variable that encodes when a user uploaded a profile photo to their account. Only 14.75% of users opted to upload a photo.

We could hypothesise that these users like the platform and wish to be identifiable, since they plan on using it more. As before, I test both between users and within users (before/after) by plotting boxplots below.

There seems to be no statistically significant difference in retention metrics for those with this feature and those without; there is also no statistically significant change before and after this feature was activated.

Regression Analysis: Retention Metrics and All Features Together

In this final section, I check whether there is any association between all these features taken together, as independent variables, and the retention metrics as dependent variables. Since we found no association when considering each feature independently, it is unlikely that we will find strong associations in this modeling exercise. Nonetheless, to illustrate the approach, I build these models below and report the regression estimates.
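For illustration, models of this form can be fit by ordinary least squares. The sketch below uses NumPy's least-squares solver rather than the statistical tooling that produced the summary tables in this section, and is only meant to show the shape of the exercise; in practice one would use a package that also reports standard errors, t-values, and R².

```python
import numpy as np

def fit_ols(X: np.ndarray, y: np.ndarray) -> np.ndarray:
    """Fit y ~ intercept + X by least squares and return the coefficient
    vector [intercept, b1, ..., bk]. Columns of X would be the features
    used here: age, friends_on_monzo, od, photo_ul, apay."""
    X1 = np.column_stack([np.ones(len(X)), X])  # prepend an intercept column
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return beta
```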

Model 1: mean_week_m_t
  Observations: 57360; Type: OLS linear regression
  F(5, 57354) = 777.49; R² = 0.06; Adj. R² = 0.06

                     Est.    S.E.  t val.     p
  (Intercept)        6.96    0.09   80.78  0.00
  age               -0.07    0.00  -28.52  0.00
  friends_on_monzo   0.10    0.00   32.79  0.00
  od                -0.33    0.05   -6.36  0.00
  photo_ul           1.79    0.07   26.23  0.00
  apay               3.55    0.12   29.20  0.00
  Standard errors: OLS

Model 2: mean_week_amount
  Observations: 57193; Type: OLS linear regression
  F(5, 57187) = 273.47; R² = 0.02; Adj. R² = 0.02

                     Est.    S.E.  t val.     p
  (Intercept)       79.94    1.66   48.11  0.00
  age                0.03    0.05    0.62  0.53
  friends_on_monzo   1.34    0.06   22.63  0.00
  od                -9.52    1.01   -9.40  0.00
  photo_ul          26.12    1.32   19.79  0.00
  apay              42.67    2.35   18.16  0.00
  Standard errors: OLS

Model 3: mean_week_app_opens
  Observations: 57360; Type: OLS linear regression
  F(5, 57354) = 949.64; R² = 0.08; Adj. R² = 0.08

                     Est.    S.E.  t val.     p
  (Intercept)       10.11    0.15   66.41  0.00
  age               -0.08    0.00  -18.38  0.00
  friends_on_monzo   0.09    0.01   16.45  0.00
  od                -0.76    0.09   -8.17  0.00
  photo_ul           6.84    0.12   56.64  0.00
  apay               4.75    0.22   22.11  0.00
  Standard errors: OLS

From these regression tables we can infer the following:

  • Although the p-values for most of the estimates are below the significance level (0.05), the R² and adjusted R² for all three models are very low. The latter (multiplied by 100) may be interpreted as the percentage of the variance in the dependent variable explained by the model. For all three models this value is below 10, which indicates that they cannot explain most of the variation in the outcome variable. This is also why the regression estimates for most of the independent variables are low or close to zero, and why the intercept term has the largest coefficient estimate in all three models. The p-values being below significance tells us only that we are fairly certain the estimates are not zero, even if they are very small.

  • With that caveat in mind, it is still worth analysing and interpreting the results. The two variables with the largest regression estimates are profile photo upload and Android Pay activation. Although it is not straightforward to compare the estimates for these categorical variables with those of continuous variables, this does seem to indicate a positive association between activating these features and all three retention metrics. Perhaps we did not see a strong association when we considered them individually due to the imbalanced samples (very few users had these features activated), so it is worth collecting more data and investigating this further.

Conclusion

In this answer, I explored the associations between the user feature variables and retention metrics. When these features were considered individually, there was no statistically significant association between them and performance on retention metrics. On the other hand, the regression model, with caveats, suggests a positive association between Android Pay activation and profile photo upload and performance on all three retention metrics. This requires further investigation before we can be certain.

Question 4: Are Monzo customers retained better over time?

The short answer to this question is a qualified yes. From the data made available to me, which covers the first few months after the product launched, user retention seems to be steadily increasing on all three metric families I proposed. I detail the qualifications to this answer below: